ABSTRACT


This thesis documents the design and implementation of an efficient primal simplex 
capacitated transshipment network optimizer, SNET, written in the C programming 
language. It describes a general symbolic network algorithm, discusses fundamental 
decisions regarding data structures and essential functions and their relationship to the 
network algorithm, and then details SNET’s development. Development tools used in 
this project, including standard test problems, profilers, timing routines, external drivers, 
and debuggers, are also covered. 

The resulting solver, SNET, is quite fast on standard NETGEN test problems, ap- 
proximately twice as fast as a primal simplex network solver written in FORTRAN. The 
effect of tuning parameters on SNET’s performance is minimal. 
THESIS DISCLAIMER


The reader is cautioned that computer programs developed in this research may not
have been exercised for all cases of interest. While every effort has been made, within
the time available, to ensure that the programs are free of computational and logic
errors, they cannot be considered validated. Any application of these programs without
additional verification is at the risk of the user.
I. INTRODUCTION


A minimum cost capacitated transshipment (min-cost) network can model many
important problems that industry and the military must solve repeatedly: shipment of
commodities, assignment of personnel and resources to tasks and requirements, routing
of vehicles, and design of distribution and communications systems, to name but a few.

This report describes the development of SNET, a fast and efficient C program for 
solving minimum cost capacitated network problems, which the author wrote for his 
thesis research. It describes min-cost network models, documents key decisions, explains 
choices of data structures and information representations, and describes in detail the 
critical functions needed to implement the network optimization algorithm. The devel- 
opment tools that significantly aided in the design and implementation of SNET are also 
discussed, as is the effect of tuning parameters. Performance tests comparing GNET and
SNET for speed are included. 

The source code for SNET is not included with this report. To obtain a copy of 
SNET, please contact the author’s thesis advisor: Professor Gordon H. Bradley.


A. NETWORK PROBLEMS 

Minimum cost capacitated transshipment networks, hereafter referred to as 
networks, are a special class of linear programming (LP) problems. They can be used to 
model problems where 


• Each variable can be interpreted as the (integer) amount of commodity flow on a
  conduit or arc.
• Each constraint can be interpreted as a point or node that interacts with
  commodity flow arcs.
• Each node may either supply commodity to the network, consume commodity from
  the network, or transfer commodity to another node.
• Each arc is connected to two network nodes.
• The total supply of commodity into the network equals the total consumption of
  commodity from the network.


Examples of problems that meet these criteria abound in operational, strategic, and
planning arenas. Min-cost networks are frequently used to model the following general
problems, which are common to both the military and industrial worlds:


• Communication Networks
• Personnel Assignments
• Movement of Units, Supplies, and Commodities
• Logistic / Production Planning
• Financial Planning


Additionally, there are many specialized models unique to the military community:

• Wartime Allocation of Aggregated Assets
• Specialized Weapon to Target Assignments
• Manpower Mobilization
• Distribution of Intelligence Collection Assets


From these examples, it is clear that network models are not only important, but
can also be quite large. For example, a military manpower mobilization model to
determine U.S. Army wartime officer assignments could have over one hundred thousand
nodes and one million arcs. Thus, in solving network problems, the speed and efficiency
of the solver are important considerations.


B. NETWORK FORMULATIONS AND SOLVERS 

The network matrix formulation (Figure 1) is the same as the general LP matrix
formulation. However, the network constraint matrix has a special property: its entries
can only be +1, -1 or 0, and each column must have exactly one +1 and one -1.

A network can also be interpreted as a directed graph. Each node in the graph is
either exogenous, supplying flow into the network or demanding flow from it, or
endogenous, transferring (or transshipping) flow through the network. Within the
network, total supply must equal total demand. Each node acts as a constraint, as the sum
of flow into and out of each node must equal zero. Each arc acts as a variable with a
lower and upper flow capacity bound and a linear cost proportional to its current rate
of flow. Network vocabulary often uses the terms arc and variable interchangeably.
An optimal network solution transfers the flow through the network at the minimum
possible cost. More detailed information on networks can be obtained from many
standard linear programming references [Ref. 1: pp. 404-439].

This paper concentrates on an implementation of the primal simplex algorithm, but
efficient solvers exist for many network algorithms. GNET [Ref. 2] and RNET [Ref. 3]
are sophisticated primal simplex programs. KILTER [Ref. 4] is a fast primal-dual
implementation and RELAX-II [Ref. 5] uses the relaxation method. Each of them has
contributed to the advance of practical network solvers.











$$
\begin{aligned}
\text{Minimize}   \quad & c x \\
\text{Subject to} \quad & A x = b \\
                        & l \le x \le u
\end{aligned}
$$

Where   x         is the decision variable vector
        c         is the cost vector
        A         is the constraint coefficient matrix
        b         is the right hand side constraint vector
        l and u   are the variable bound vectors





Figure 1. Network Linear Program Matrix Formulation. 


C. TYPOGRAPHIC CONVENTIONS 

In this paper, C programming language functions and variable names will be
displayed in lower case boldface type. This convention accurately represents the actual
variable and function names as the C language is case sensitive. Important terminology
will be highlighted in italics.














II. NETWORK ALGORITHMS


SNET is based on the symbolic algorithm, a specialized version of the bounded 
variable simplex method. The symbolic algorithm solves min-cost network problems 
using a graph interpretation of the model instead of the standard simplex tableau. 
Though most standard LP texts derive the bounded variable simplex method, many do 
not cover the symbolic algorithm. This chapter fills that void. It summarizes the critical 
information needed by the bounded variable simplex method, and then presents the 
symbolic algorithm, an alternative method of manipulating that information for
networks.


A. THE BOUNDED VARIABLE SIMPLEX ALGORITHM 

The tableau from the bounded variable simplex method contains four essential
vectors of variables: basic variables, nonbasic variables at their lower bound, nonbasic
variables at their upper bound, and the current values of the basic variables. With these
four vectors and the information needed to begin the simplex method (the constraint
matrix and cost vector), one can recreate the rest of the tableau (dual prices, reduced
costs, and the value of the objective function) and, hence, any basic feasible solution
[Ref. 1: pp. 201-212].

When the LP under consideration is a network, recreating the feasible solution is
simplified. First, by convention, all exogenous network variables and arc costs are
integer valued. If this convention is not natural to the problem, it is achieved by scaling.
Second, since the entries in the constraint matrix of a network consist of only positive
ones, negative ones, or zeros, the basis inversion and subsequent matrix multiplications
to recreate dual prices and reduced costs are all integer operations [Ref. 6: pp. 305-306].

The symbolic algorithm provides an easier way to recreate the feasible solution. The 
Basis-Tree theorem, discovered by Koopmans and Hitchcock [Ref. 7], states that:

A set of columns of the constraint matrix comprise a basis if and only if the corre- 
sponding set of arcs form a spanning tree. 
Thus, the collection of basic arcs (variables) may be interpreted as a spanning tree. One
can associate with each basic arc its cost and current flow. If the nonbasic arcs and their
costs are maintained in two lists, one for arcs at their upper bound and the other for arcs
at their lower bound, all the information needed to recreate the feasible solution is
present.

After an initial basic feasible solution is obtained, the bounded variable simplex 
method has four major steps: 
1. Calculate the dual price of each constraint and the reduced cost of each variable.
2. Select a favorable variable to enter the basis. If none exists, then the solution is
   optimal.
3. Determine which variable should leave the basis. If none would leave the basis,
   then the solution is unbounded.
4. Adjust relevant variables and update the basis.


The combination of steps 1 and 2 is referred to as selecting candidates. Steps 3 and 4,
together, are referred to as pivoting the basis. The symbolic algorithm uses the same 
basic steps, but its calculations emanate from the graph representation of the spanning 
tree and not from matrix manipulation. 


B. THE SYMBOLIC ALGORITHM 

To obtain an initial feasible basis (spanning tree) for the symbolic algorithm,
artificial arcs connect each node to the root node. The root node is the only artificial node
in the spanning tree. It corresponds to the redundant row, equal to minus the sum of
the other rows, that can be appended to the constraint matrix. This row ensures that
each artificial variable has exactly one +1 and one -1 in its column. Recall
that a pure network problem has exactly one redundant constraint, hence its basis would
be singular and noninvertible. Therefore any feasible basis must contain at least one
artificial arc, though it may have zero flow.

Artificial arcs for supply nodes flow from the supply node to the root node.
Similarly, arcs for sink nodes flow from the root node to the sink nodes. Artificial arcs for
endogenous nodes may flow either way. Initially, all flow passes through the root. The
cost for each artificial arc should be high, to drive it out of the basis. An artificial arc’s
capacity can be set equal to its initial flow, so no increase in its flow is possible.
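
The construction of the artificial starting basis can be sketched in C. This is only an illustration under stated assumptions, not SNET's code: the record `art_arc`, the function `make_artificial_arcs`, the index `ROOT`, and the sign convention (positive exogenous flow means supply, negative means demand) are all hypothetical.

```c
#include <assert.h>

#define ROOT 0   /* hypothetical index reserved for the artificial root node */

/* Hypothetical record for one artificial arc; not SNET's arc structure. */
struct art_arc {
    int  tail, head;   /* the arc points from tail to head */
    long flow;         /* initial flow                     */
    long cap;          /* upper capacity                   */
    long cost;         /* unit cost (Big M)                */
};

/* Assumed convention: exog[i] > 0 is supply, exog[i] < 0 is demand, and
   exog[i] == 0 is a transshipment node. Nodes are numbered 1..n, so
   exog[] and out[] need n + 1 entries; entry 0 is unused. */
void make_artificial_arcs(const long *exog, int n, long bigm,
                          struct art_arc *out)
{
    int i;
    assert(n >= 0 && bigm > 0);
    for (i = 1; i <= n; i++) {
        if (exog[i] >= 0) {          /* supply or transshipment: node -> root */
            out[i].tail = i;
            out[i].head = ROOT;
            out[i].flow = exog[i];
        } else {                     /* demand: root -> node */
            out[i].tail = ROOT;
            out[i].head = i;
            out[i].flow = -exog[i];
        }
        out[i].cap  = out[i].flow;   /* no increase in flow is possible */
        out[i].cost = bigm;          /* high cost drives the arc out of the basis */
    }
}
```

Transshipment nodes are given an arc toward the root here; as the text notes, either direction would serve.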

1. Dual Prices and Reduced Costs

Dual prices (DP) are uniquely associated with each constraint in a general LP
and, hence, with each node in a network. In the symbolic algorithm, a node’s DP can
be interpreted as the cost that a unit of flow would incur while travelling from the node
itself to any arbitrary node. The root node can serve as that arbitrary node. Therefore,
a node’s DP can be equal to the total cost that a unit of flow would incur as it travels
from the node to the root node. If an arc flows from the current node to the root node,
then its cost is added to the DP. If the arc flows away from the root and towards the
current node, then its cost is subtracted from the DP.

The reduced cost of each basic arc is zero, as in every variant of the simplex
method. To obtain the reduced cost of a nonbasic arc, simply add the arc’s cost to the
DP of the arc’s head and subtract the DP of the arc’s tail.
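
The two rules above can be sketched as a pair of C functions. The structures `node_s` and `arc_s` and the function names are hypothetical, pared-down stand-ins chosen for this illustration; they are not SNET's declarations.

```c
#include <assert.h>
#include <stddef.h>

struct arc_s;

/* Hypothetical minimal node record: only what the dual price walk needs. */
struct node_s {
    long           dual;         /* dual price relative to the root     */
    struct node_s *parent;       /* NULL for the root node              */
    struct arc_s  *parent_arc;   /* basic arc joining node and parent   */
};

/* Hypothetical minimal arc record; the arc points from tail to head. */
struct arc_s {
    long           cost;
    struct node_s *head;
    struct node_s *tail;
};

/* Walk from n up to the root, adding the cost of arcs that point toward
   the root and subtracting the cost of arcs that point away from it. */
long dual_price(const struct node_s *n)
{
    long dp = 0;
    assert(n != NULL);
    while (n->parent != NULL) {
        const struct arc_s *a = n->parent_arc;
        assert(a != NULL);
        if (a->tail == n)
            dp += a->cost;   /* arc flows from this node toward the root */
        else
            dp -= a->cost;   /* arc flows from the root side toward this node */
        n = n->parent;
    }
    return dp;
}

/* Reduced cost: arc cost plus the DP of the head, minus the DP of the tail. */
long reduced_cost(const struct arc_s *a)
{
    return a->cost + a->head->dual - a->tail->dual;
}
```

With these sign conventions every basic (tree) arc evaluates to a reduced cost of zero, which gives a convenient consistency check. A practical solver would presumably update dual prices incrementally rather than rewalking the tree for each node; the walk is used here only because it mirrors the definition.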

2. Entering Variable 

In a min-cost problem, a favorable variable entering the basis has the potential
to reduce the value of the objective function. An arc entering the basis at its lower
bound with a negative reduced cost will decrease the objective function as its flow
increases. Similarly, an arc entering the basis at its upper bound with a positive reduced
cost will decrease the objective function as its flow decreases. The greater the magnitude
of the reduced cost, the more favorable the arc. Therefore, arcs may be ranked or sorted
by their reduced costs.

At each pivot, the reduction of the objective function’s value is equal to the 
entering arc’s flow change, which may be zero, multiplied by its reduced cost. Although 
choosing the entering variable with the most favorable reduced cost will not ensure the
largest improvement in the objective function, extensive experimentation over many
years has shown that, on average, it is best to choose entering variables this way.

As before, if no favorable arcs can be found, the solution is optimal. 
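
The favorability rule can be sketched as follows. The record `cand`, the tag `enum bound`, and both function names are hypothetical; the sketch scans a plain array, whereas a solver would more likely scan linked lists of nonbasic arcs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag recording which bound a nonbasic arc sits at. */
enum bound { AT_LOWER, AT_UPPER };

/* Hypothetical candidate record: a reduced cost and the arc's bound. */
struct cand {
    long       rc;
    enum bound at;
};

/* Positive when bringing the arc into the basis can lower the objective:
   a negative reduced cost at the lower bound, or a positive reduced cost
   at the upper bound. Larger values are more favorable. */
long favorability(const struct cand *c)
{
    assert(c != NULL);
    return (c->at == AT_LOWER) ? -c->rc : c->rc;
}

/* Return the index of the most favorable candidate, or -1 if no candidate
   is favorable, in which case the current solution is optimal. */
int pick_entering(const struct cand *c, int n)
{
    int  i, best = -1;
    long best_val = 0;
    for (i = 0; i < n; i++) {
        long f = favorability(&c[i]);
        if (f > best_val) {
            best_val = f;
            best = i;
        }
    }
    return best;
}
```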

3. Exiting Variable 

Selecting an arc to leave the basis is the next step in the symbolic algorithm. 
Recall that the basis arcs form a spanning tree. When the arc entering the basis is added 
to this tree, a unique cycle is formed [Ref. 8: pp. 32-33]. As flow is adjusted around this 
cycle, the feasible solution approaches optimality. Each of the cycle arcs must be ori- 
ented either with or against the incoming arc. If the incoming arc enters the basis at its 
lower bound, then its flow can only increase. If flow increases on the incoming arc, it
will increase on arcs oriented with the incoming arc and decrease on arcs oriented 
against it. If the incoming arc enters the basis at its upper bound, then its flow can only 
decrease. If flow decreases on the incoming arc, it will decrease on arcs oriented with 
the incoming arc and increase on arcs oriented against it. The arc which limits the 
change in flow induced by the incoming arc will be the exiting variable. 

Flow can be limited in one of three ways. First, the incoming arc itself could
reach its opposite bound. For example, if the incoming arc enters the basis at its lower
bound, the flow around the cycle can change until the incoming arc reaches its upper
bound. Second, flow on an arc already in the basis can increase until that arc reaches
its upper bound. Third, flow on an arc already in the basis can decrease until that arc
reaches its lower bound. All flow changes are integer valued and it is quite possible that
multiple arcs will limit flow simultaneously. In this case, any (one) flow limiting arc in
the cycle may be chosen to leave the basis.

As in other simplex methods, if no arc reaches a limit the solution is unbounded;
however, in practical problems this is unlikely to occur.

4. Flow Adjustment 
Cycle flow can be adjusted after the limiting arc has been identified. During 
flow adjustment, the limiting arc’s flow will be driven to either its lower or upper bound. 
The arc is then removed from the basis and placed in the proper nonbasic list. This step 
completes the symbolic network algorithm. 
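
Selecting the limiting arc and adjusting flow around the cycle can be sketched in C. The sketch assumes the cycle has already been collected into an array whose element 0 is the incoming arc, and that every capacity is finite; the record `cycle_arc` and the function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical record for one arc of the pivot cycle. */
struct cycle_arc {
    long lo, hi, flow;
    int  with;   /* 1 if oriented with the incoming arc, 0 if against it */
};

/* dir is +1 when the incoming arc enters from its lower bound (its flow
   rises) and -1 when it enters from its upper bound (its flow falls). */
static int flow_sign(const struct cycle_arc *a, int dir)
{
    return a->with ? dir : -dir;
}

/* Return the largest feasible flow change around the cycle and store the
   index of the limiting arc in *limit. On ties the first arc found is kept. */
long limiting_change(const struct cycle_arc *c, int n, int dir, int *limit)
{
    long best = -1;
    int  i;
    assert(n > 0 && (dir == 1 || dir == -1));
    for (i = 0; i < n; i++) {
        long room = (flow_sign(&c[i], dir) > 0)
                        ? c[i].hi - c[i].flow    /* flow rises toward hi */
                        : c[i].flow - c[i].lo;   /* flow falls toward lo */
        if (best < 0 || room < best) {
            best = room;
            *limit = i;
        }
    }
    return best;
}

/* Push delta units around the cycle; the limiting arc lands on a bound. */
void adjust_flow(struct cycle_arc *c, int n, int dir, long delta)
{
    int i;
    for (i = 0; i < n; i++)
        c[i].flow += flow_sign(&c[i], dir) * delta;
}
```

If the limiting index is 0, the incoming arc itself has reached its opposite bound and the basis tree is unchanged.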

















III. FUNDAMENTAL DESIGN DECISIONS


During SNET’s conceptual phase, several fundamental design decisions were made: 
design objectives, choice of a programming language, and composition of essential data
structures. These fundamental decisions impacted upon SNET’s development in two 
ways. 

First, they provided a common philosophical foundation. This foundation assures
that the program, taken as a whole, is easier to understand and maintain than a
collection of less similar modules. A common design philosophy also eases the
programmer’s task by providing a standard frame of reference for each module. Easing the
programmer’s efforts leads to fewer coding errors and reduces development time.

Second, the fundamental design decisions restricted the possible alternatives that 
could be employed. The choice of a language naturally limits one to the features avail- 
able in that language. Likewise, the information present in the initial data structures 
may be insufficient for effective implementation of an alternative unforeseen during the 
conceptual phase. 

These factors, guidance and restriction, dictated that careful consideration be given
to the fundamental design decisions. 


A. DESIGN OBJECTIVES

A number of desirable design objectives were identified during the conceptual phase:

• Solve large scale problems
• Simple
• Easily Understood
• Modular
• Portable
• Fast


Although the network formulation itself is not inherently large, the problems that 
networks represent often are. Thus, for a network solver to be applicable to a wide 
range of problems, it must be able to handle large scale networks. How large is large 
scale? The scale or size of a problem is defined by the number of nodes and arcs in the 
network. In SNET, large is the smaller of two limits which depend upon the host
computer: integer capability and available memory.














Integer capability requires that the sum of the number of arcs and nodes in a
network be less than the maximum integer that the host can represent. Typical maximum
integers are 2^(16-1) (32,768) for a personal computer (PC) and 2^(32-1) (2,147,483,648) for a
workstation or mainframe computer. For example, the integer capability of a PC would
allow it to handle any network with fewer than 2^15 nodes and arcs.

The amount of host memory, both core and virtual, available for dynamic data
representation can also limit problem size. Each node’s dynamic representation requires
a fixed amount of memory, as does each arc. A certain amount of memory is also
required for program overhead. The sum of these three memory requirements (node, arc,
and overhead) cannot exceed the amount of data memory available. For example, if a
node requires 40 bytes, an arc requires 20 bytes, 4K of overhead is present, and 64K of
host memory is available, then any combination of nodes and arcs that could be stored
in no more than 60K (60,000 bytes) would be allowable. A problem with 200 arcs and
1400 nodes (60,000 bytes) would be acceptable, but one with 300 arcs and 1351 nodes
(60,040 bytes) would not.

The smaller of these two limits, integer capability and host memory, determines the 
maximum problem size. Thus, in the example given, host memory available is the lim- 
iting factor. 
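
The two limits can be checked with a few lines of C. This sketch assumes that 1K means 1,000 bytes and that a problem which exactly fills memory is acceptable, since that is the reading under which both examples in the text hold; the function names are hypothetical.

```c
#include <assert.h>

/* All sizes are in bytes. Returns 1 if the nodes, the arcs, and the program
   overhead together fit in the available memory, 0 otherwise. */
int fits_in_memory(long nodes, long arcs, long node_bytes, long arc_bytes,
                   long overhead, long available)
{
    assert(nodes >= 0 && arcs >= 0);
    return nodes * node_bytes + arcs * arc_bytes + overhead <= available;
}

/* Returns 1 if the total count of nodes and arcs is below the host's
   maximum integer. */
int fits_in_integer(long nodes, long arcs, long max_int)
{
    assert(nodes >= 0 && arcs >= 0);
    return nodes + arcs < max_int;
}
```

For the example in the text, 1400 nodes and 200 arcs need 56,000 + 4,000 + 4,000 = 64,000 bytes and just fit, while 1351 nodes and 300 arcs need 54,040 + 6,000 + 4,000 = 64,040 bytes and do not. Both problems pass the 16-bit integer test easily, so memory is the binding limit.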

How simple and easily understood a design is depends upon one’s education and
experience. SNET can be understood by individuals with moderate experience in
network linear programming and a minimal knowledge of the C programming language.
One graduate course in each should suffice.

Simple and easily understood also implies that complex strategies will be employed 
only when there is a significant gain in performance. Thus a complicated scheme, that 
offered only modest improvement over a basic one, would not be utilized. 

Simple designs are, by definition, easily explained. Thus, even though source code 
may not be immediately legible to the uninitiated, one should be able to explain it with 
a minimum of sophisticated verbiage. This increases the odds that others may be able
to contribute to an improved solver in the future. Furthermore, software maintenance 
costs can be drastically reduced. 

By keeping solver design simple, one can concentrate on principles of the network 
algorithm, rather than on the mechanics of the solver itself. The author believes that 
this philosophy offers the best chance for future improvements in algorithm develop- 
ment. 

A modular design encapsulates frequently used or functionally unique portions of
code into subroutines which are called without reference to their internal design. It also
allows different algorithms to be incorporated into the program without major
restructuring. One merely exchanges the new algorithm module for the old module without
concern for possible global effects. Thus modular code is easy to modify and allows
rapid development and testing of new algorithms.

There is, of course, a cost for the benefits of modular code. If the overhead of
entering and exiting the module is substantial, considerable time could be wasted by the
modular structure. In this case, one would have to carefully balance the advantages of
modular structure against its cost.

A portable design can be easily transferred among many different host computers.
A single portable program can, with minimal recoding, serve in several environments.
The savings in programming effort are obvious. The disadvantages are not. If a
programmer uses only features that are completely portable, then often he cannot employ
advanced, nonstandard features of a language that are more efficient for a particular
computer. The author employs only portable features.

Finally, the benefit of a fast program is obvious: problems can be solved in less 
time. 


B. PROGRAMMING LANGUAGE 

SNET is written in the C programming language in accordance with the proposed
ANSI C standard. Optimizing a network is an intense computational problem. For
years FORTRAN was the dominant language for such tasks. However, within the last
ten years, C has gained prominence [Ref. 9: p. 152] for intense computational problems
in many communities since it has several advantages over FORTRAN.

First, C generally executes faster than FORTRAN. Although more noticeable on 
UNIX machines, whose operating systems are written in C, this is generally true for 
most computers. Comparing assembly language generated from C to that generated
from FORTRAN reveals that C frequently produces shorter and more efficient code
segments [Ref. 9: pp. 155-156].

Second, C users can employ true pointer (memory address) variables. FORTRAN
users cannot. The components of a network, nodes and arcs, are not independent.
Rather, they are coupled together in a determinate fashion. For such problems, pointers
are a natural method of linking components together. Although one can implement
array based pseudo-pointers in any language, the use of true pointers is preferable. A
simple example illustrates why.











Suppose one wishes to find the child of a node. The pointer based construct for this
operation is

    answer = node -> child ,

whereas the array based construct is

    answer = child(node) .

In the pointer based construct, node is a pointer to the address of a structure (a
collection of data variables) and child is a member of that structure. The member, child,
is always located at a constant offset from the beginning of the node structure. Given
the location of the structure, the program immediately knows the location of each
member, and need only go to that location and return the value of the variable located
there. The process of going to a location and returning the value at that location is
known as indirection. Thus, retrieving the answer, which is a pointer in this case,
requires one indirection.

The array based construct requires significantly more effort. In this construct, node
is an integer valued variable and child is an integer array. The computer must first
retrieve the value of node, then, using that value and the size of the array elements,
calculate the offset from the beginning of the array. It must then add that offset to the array’s
address to return the required address and, finally, perform an indirection on that
address. Thus, retrieving the answer, which is an integer for the array based construct,
requires several steps in addition to the indirection.
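
Both constructs can be written side by side in C for comparison. The type `tnode`, the array size `NNODES`, and the function names are hypothetical, and the array form uses C's `child[node]` in place of the FORTRAN-style `child(node)` shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer based: the child is a member at a fixed offset in the structure. */
struct tnode {
    struct tnode *child;
};

struct tnode *child_by_pointer(const struct tnode *node)
{
    assert(node != NULL);
    return node->child;          /* one indirection at a constant offset */
}

/* Array based pseudo-pointers: nodes are integer indices 1..NNODES and
   child[] holds the index of each node's child (0 means no child). */
#define NNODES 4
static int child[NNODES + 1];

int child_by_index(int node)
{
    assert(node >= 1 && node <= NNODES);
    return child[node];          /* scale the index, add the base, then load */
}
```

The comments restate the access steps described in the text; the instructions actually generated depend on the compiler and the host machine.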

Third, C allows structures, collections of (possibly dissimilar) data variables, to be
easily implemented. FORTRAN has no structure facilities. Structures, sometimes
referred to as records in other languages, allow related groups of variables to be treated
in a natural manner. This intuitive method of organizing complicated data simplifies and
clarifies programming requirements by allowing these groups to be handled as a unit.
For networks, whose nodes and arcs each have several different types and items of
information, structures greatly simplify data retrieval and manipulation.

Fourth, the C language protocol specifies standard libraries of functions for
input/output, type conversion, math, memory management, and similar operations.
These functions are, technically, not part of the C language. Rather, they provide a
support environment within which C can efficiently work. Other languages, including
FORTRAN, incorporate these functions into the language itself. In C, each compiler,
which is usually machine specific, has its own set of function libraries. Since these
library interfaces are standardized, the means of accessing each function from within C is
always the same, regardless of the computer used. But as the libraries are designed for
each machine, the function’s implementation can be locally optimized. Thus, the C
function libraries support development of portable programs which will run efficiently
on most computers.

Fifth, C can enter and exit its function modules with minimal overhead. Thus
modular design, breaking large, complex tasks into smaller, simpler ones, can be used
without excessive fear of inefficiency in C.


C. DATA STRUCTURES 

SNET’s network data is maintained in two data structures: node and arc. A third
structure, the arc list element (ALE), is used when pivoting the basis. The C code
needed to implement these three structures is in Figure 2.

Both nodes and arcs have properties and values that are intrinsic to them. All
variables are integer valued. The maximum value that a variable may assume determines
if it should be typed as an int (integer) or long (long integer) variable.

Each node has a label that identifies its external association point. It may have flow
that is exogenous (external) to the network. Within the basis tree, each node has a
specific depth and a dual price that is relative to the root node. The depth and dual price
are likely to change as the basis is updated.

An arc also has several important elements. It has a cost per unit flow, a maximum
capacity, and often a minimum capacity. Of course, it will always have a current flow 
value, which may be zero, and a reduced cost. The latter two quantities may be fre- 
quently updated. 

However, as noted before, nodes and arcs do not exist in a vacuum. They are
coupled together in a determinate fashion. SNET uses pointers to link components
together. Note that pointers point to other structures, not from them.

In SNET, each node can be connected to three other nodes: a parent node, a sibling
node, and a child node. Thus the node structure contains three node pointers: parent
(p), sibling (s), and child (c). Recall that each node in the network, except the artificial
root node, always has exactly one parent node and is connected to that parent node by
exactly one arc. Thus the node structure contains one arc pointer: parent arc (pa).
Each node is always accounted for in the basis spanning tree, so there is no need to track
it elsewhere. It is also important to note that each node contains sufficient information
to enter the basis tree and trace completely through it. Therefore, these four pointers
are sufficient to link the node into the basis tree at all times.











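    /* Structures */

    /* The scalar member names did not survive reproduction of this figure;   */
    /* label, depth, exog, dual, lo, hi, cost, rc, and flow are placeholders. */
    /* The pointer names p, s, c, pa, h, t, next, and a are those used in the */
    /* text.                                                                  */

    struct nodetype
    {   int              label ;    /* label, i.e. node nbr   */
        int              depth ;    /* tree depth             */
        long             exog ;     /* exogenous flow         */
        long             dual ;     /* dual price             */
        struct nodetype  *p ;       /* parent                 */
        struct nodetype  *s ;       /* sibling                */
        struct nodetype  *c ;       /* child                  */
        struct arctype   *pa ;      /* parent arc             */
    } ;
    typedef struct nodetype node ;

    struct arctype
    {   long             lo ;       /* lower capacity         */
        long             hi ;       /* upper capacity         */
        long             cost ;     /* unit flow cost         */
        long             rc ;       /* reduced cost           */
        long             flow ;     /* flow                   */
        struct nodetype  *h ;       /* head node              */
        struct nodetype  *t ;       /* tail node              */
        struct arctype   *next ;    /* next arc in 'out' list */
    } ;
    typedef struct arctype arc ;

    struct arclistelemtype
    {   struct arclistelemtype  *next ;
        struct arctype          *a ;
    } ;
    typedef struct arclistelemtype ale ;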


Figure 2. SNET Data Structures. 





Arcs are simpler to handle. When in the basis, each arc is associated with one head
node and one tail node. When out of the basis, the arc will be located in one of three
linked lists of similar arcs, depending on whether its flow is at its lower or upper bound,
or the arc is being considered as a candidate for entering the basis. Thus the arc
structure contains two node pointers, head (h) and tail (t), and one arc pointer, next arc
(next).


It is informative to note that array based languages must also track the interlinking
relationships and integer values just detailed. Generally, a set of parallel arrays is the
only practical method for non-pointer languages to handle this information. As noted
earlier in this chapter, parallel array operations are inefficient compared to pointer
operations. Programs employing parallel arrays suffer a significant performance
degradation.











The third structure, the ALE, identifies the arcs in the unique cycle formed during
pivoting. It has two pointers: next ALE (next), and arc (a). Cycle arcs are pointed to
by linked lists of ALEs, where next links elements of the list together, and a points to
the cycle arc that that element identifies.
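
A minimal sketch of how a list of ALEs might be built and walked is given below. The ALE layout follows the text (next and a); the one-member `struct arctype` is a stand-in, and the helpers `ale_push`, `ale_length`, and `ale_free` are hypothetical, as is the use of malloc for each element.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in arc record for this sketch only. */
struct arctype {
    long flow;
};

struct arclistelemtype {
    struct arclistelemtype *next;   /* next element of the list          */
    struct arctype         *a;      /* cycle arc this element identifies */
};
typedef struct arclistelemtype ale;

/* Add an element pointing at arc a to the front of the list. Returns the
   new head, or NULL if memory is exhausted (the old list is untouched). */
ale *ale_push(ale *head, struct arctype *a)
{
    ale *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->next = head;
    e->a = a;
    return e;
}

/* Count the elements by following the next pointers. */
int ale_length(const ale *head)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Release every element; the arcs themselves are not freed. */
void ale_free(ale *head)
{
    while (head != NULL) {
        ale *next = head->next;
        free(head);
        head = next;
    }
}
```

Only the arcs on the current cycle need a list element, so the extra memory is proportional to the cycle length rather than to the number of arcs.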


Another possible implementation to efficiently mark the cycle involves placing an
additional arc pointer in the arc structure. However, this design must place that
additional pointer in each arc structure, significantly increasing the memory required to
represent the dynamic data structures.














IV. PROGRAM DEVELOPMENT 


The symbolic algorithm and fundamental design decisions form the structure upon
which SNET is built. But the construction of an efficient program requires many other
important elements:

• Symbolic and other Constants
• Global Variables
• Key Functions: Collecting candidates, selecting a candidate, pivoting
• External Files

This chapter describes those elements. It also covers tools used to test and tune the
program.


A. SYMBOLIC AND OTHER CONSTANTS 

Several constants are important for the symbolic algorithm itself, and for the C
implementation of the algorithm. SNET uses two types of constants: symbolic and
program defined.

The numeric values of symbolic constants are set prior to program compilation and
are determined by the scale of expected problems and capabilities of the host computer.
These constants are called symbolic because they can be represented by a symbol or word
(usually denoted by capital letters in C). SNET has four symbolic constants:
MAXNODES, MAXARCS, INFINITY, and MAXCAN.

MAXNODES and MAXARCS are determined primarily by the scale of the ex- 
pected problems. They dimension arrays and control looping structures. MAXNODES 
must be at least one greater than the number of real nodes in the network to accom- 
modate the root node. MAXARCS must be at least one greater than the number of 
nodes plus the number of arcs in the network. Nodes are included in the count, as each 
node must initially be connected to the root node by an artificial arc. The maximum 
value of MAXNODES and MAXARCS is constrained only by the memory capacity of 
the host computer and its integer range. 

INFINITY should equal the largest long integer that the host can represent. It is 
used to prevent integer overflow and to initialize comparison variables when determining 
maximum and minimum values among sets. 

MAXCAN serves as the upper bound for maxnbrcan, the maximum number of
candidates. In empirical studies with problems as large as 35000 arcs, 20000 has been
effective as a bound.

Program defined constants are set by the program at run time based upon input 
data. SNET has two critical program defined constants: Big M (bigm) and maximum 
number of candidates (maxnbrcan).

Big M is the cost assigned to artificial arcs. It must be large enough to drive their
flow to zero if the problem can be solved. In SNET, Big M is one more than half the
sum of all positive real arc costs. Since any flow on an artificial arc, either to or from
the root node, must also be carried on another artificial arc, the cost assigned by Big
M is sufficient to drive their flow to zero.

If the value of Big M is too large for the host computer to represent, SNET halts
and so informs the user. In this case, the user may either rescale arc costs or set Big M
to a smaller value. However, when a lesser value is chosen and the final solution has
positive flow on one or more artificial arcs, it cannot be readily determined whether the
problem is infeasible or the value of Big M is too small.
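
The Big M rule and the representability check can be sketched as follows. This is a minimal illustration, not SNET's code: the function name compute_bigm, the array-of-costs interface, the use of integer division for "half", and the -1 error return are all assumptions made for the example.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns Big M = (sum of positive costs) / 2 + 1,
   or -1 if the sum of positive costs cannot be held in a long. */
long compute_bigm(const long *cost, size_t narcs)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < narcs; i++) {
        if (cost[i] <= 0)
            continue;                  /* only positive costs are summed */
        if (cost[i] > LONG_MAX - sum)
            return -1;                 /* adding this cost would overflow */
        sum += cost[i];
    }
    return sum / 2 + 1;                /* cannot overflow: sum <= LONG_MAX */
}
```

A caller would treat the -1 return as the signal to halt and ask the user to rescale costs or supply a smaller Big M.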

maxnbrcan is, in full, the maximum number of favorable candidates encountered
and considered for inclusion in the candidate queue during each collection. A
collection is a single attempt to gather favorable candidates.

maxnbrcan is initially set to MAXCAN. However, this default value may be
overridden by the command line argument, percan, which is the percentage of out of basis
candidates to be examined. In this case, maxnbrcan is set equal to percan multiplied
by the number of real arcs less the number of nodes. The number of nodes is subtracted
from the number of arcs before multiplication because each node (except the root node)
requires one arc to connect it to the basis tree.
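
A minimal sketch of this rule follows. It assumes that percan is given as a percentage between 0 and 100 and that MAXCAN still caps the result. The function name set_maxnbrcan and the percan_given flag are hypothetical, and the value 20000 is the bound quoted above.

```c
#define MAXCAN 20000L   /* symbolic upper bound quoted in the text */

/* Hypothetical helper: derive maxnbrcan from the optional percan argument.
   percan_given is 0 when no command line value was supplied. */
long set_maxnbrcan(int percan_given, double percan,
                   long nrealarcs, long nnodes)
{
    long outofbasis, limit;

    if (!percan_given)
        return MAXCAN;                     /* default value */

    outofbasis = nrealarcs - nnodes;       /* arcs not needed by the basis tree */
    if (outofbasis < 0)
        outofbasis = 0;

    limit = (long)(percan * (double)outofbasis / 100.0);
    return limit > MAXCAN ? MAXCAN : limit;  /* MAXCAN remains the upper bound */
}
```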


B. GLOBAL VARIABLES 

Global variables may (potentially) be accessed anywhere in the program. Local 
variables are only available within their defining function. Thus, local variables cannot 
conflict with identically named local variables in different functions. But global variables 
can cause such conflict. Hence the use of global variables should be restricted to those 
few variables that are required by several functions. 

SNET has eight global variables: 


• outhi, outlo, and canque are arc pointers to the heads of three linked arc lists:
specifically, arcs out of the basis at their upper bound, arcs out of the basis at their
lower bound, and arcs in the reduced cost candidate queue. They provide the only
access to these lists and are used frequently throughout the program.

• arcs and nodes are arrays of pointers to all arcs and all nodes. They are used to set
up the initial feasible solution, and to output the final (optimal) feasible solution.

• root is a node pointer to the root node. It is used to establish the initial feasible
solution and as a default node to enter the basis tree.

• input and output are FILE pointers to the input and output files.
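
The eight globals might be declared roughly as below. This is a sketch only: the field names are taken from the snapshot functions of Figure 4, but the types, the comments on what each field means, and the cost field are assumptions rather than SNET's actual definitions.

```c
#include <stdio.h>

typedef struct node node;
typedef struct arc  arc;
typedef struct ale  ale;

struct arc {
    node *t, *h;       /* tail and head nodes */
    long  cost;        /* arc cost (field name assumed) */
    long  x, u;        /* flow and capacity (assumed meanings) */
    long  r;           /* reduced cost */
    arc  *next;        /* next arc in whichever list holds this arc */
};

struct node {
    int   lbl;         /* node label */
    node *p, *c, *s;   /* parent, first child, next sibling */
    int   d;           /* depth in the basis tree */
    arc  *pa;          /* parent arc */
    long  dp;          /* dual price */
};

struct ale {           /* arc list element */
    arc *a;
    ale *next;
};

/* The eight global variables named in the text. */
arc   *outhi, *outlo, *canque;   /* heads of the three linked arc lists */
arc  **arcs;                     /* array of pointers to all arcs */
node **nodes;                    /* array of pointers to all nodes */
node  *root;                     /* root of the basis tree */
FILE  *input, *output;           /* input and output files */
```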


C. KEY FUNCTIONS 
Practical implementations of the symbolic algorithm often organize candidate
selection and basis pivoting into three steps:

• Collecting a large group of favorable arcs into a candidate queue

• Getting candidates from the queue until it is empty

• Pivoting candidates, one at a time, into the basis


SNET’s principal functions, collect_can, getcan, and pivot, implement these three
steps. This section will describe them in sufficient detail for the reader to understand the
general implementation of the symbolic algorithm and its relationship to the data
structures discussed thus far. Specific C code will not be discussed.

1. Collecting Candidates 

Collecting and organizing candidates is accomplished by two functions:
collect_can and merge. collect_can performs a collection by (usually) traversing
every arc in the out of basis lists. As each arc is visited, collect_can calculates its reduced
cost. If the reduced cost is favorable, it determines if a more favorable arc with the same
head node has already been traversed. If a more favorable arc has not been traversed,
the arc and its predecessor in the linked list are recorded. Thus, after the traversal is
complete, the most favorable arc flowing into each node has been recorded. These
favorable arcs are then removed from their out of basis list and placed into another linked
list, the candidate queue. The merge function [Ref. 10: pp. 113-114] can be used to sort
linked lists. collect_can uses merge and a modified binomial comb [Ref. 10: p. 264] to
sort the candidate arcs by the absolute value of their reduced cost. After sorting,
collect_can sets the canque pointer to the first arc in the candidate queue and returns the
number of candidates in the queue.

Unless instructed otherwise, collect_can will traverse the entire out of basis list.
However, as discussed earlier, the user can set maxnbrcan, the maximum number of
favorable candidates to examine for inclusion in the candidate queue during each
collection.

In summary, collect_can examines the out of basis lists, and builds the candidate
queue, a sorted linked list containing the most favorable candidate arc flowing into each
node.
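
The "most favorable arc into each head node" step can be sketched as follows. This is a simplified illustration over arrays rather than SNET's linked lists. The names nbarc, redcost, and best_per_head are hypothetical, and the reduced cost convention (cost + dp[tail] - dp[head], with negative values favorable at the lower bound and positive values favorable at the upper bound) is an assumption. Removal from the out of basis lists and the final sort are omitted.

```c
/* Hypothetical nonbasic arc record for the sketch. */
typedef struct {
    int  tail, head;   /* node indices */
    long cost;
    int  at_upper;     /* 1 if out of basis at upper bound, 0 if at lower */
} nbarc;

/* Reduced cost under the assumed sign convention. */
static long redcost(const nbarc *a, const long *dp)
{
    return a->cost + dp[a->tail] - dp[a->head];
}

/* For each head node, record in best[] the index of the most favorable
   nonbasic arc entering it, or -1 if none is favorable.
   Returns the number of candidates recorded. */
int best_per_head(const nbarc *arcs, int narcs, const long *dp,
                  int *best, int nnodes)
{
    int i, count = 0;

    for (i = 0; i < nnodes; i++)
        best[i] = -1;

    for (i = 0; i < narcs; i++) {
        long r    = redcost(&arcs[i], dp);
        long gain = arcs[i].at_upper ? r : -r;   /* > 0 means favorable */
        int  h    = arcs[i].head;

        if (gain <= 0)
            continue;
        if (best[h] < 0) {
            best[h] = i;                          /* first candidate for h */
            count++;
        } else {
            long rb = redcost(&arcs[best[h]], dp);
            long gb = arcs[best[h]].at_upper ? rb : -rb;
            if (gain > gb)
                best[h] = i;                      /* strictly better arc */
        }
    }
    return count;
}
```

Keeping one candidate per head node bounds the queue length by the number of nodes, whatever the size of the out of basis lists.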

2. Picking a Candidate 

Candidates must be selected to be pivoted into the basis until the candidate
queue is exhausted. getcan, which performs this function, begins at the head of the
candidate queue, indicated by canque, and traverses it until a favorable candidate is
found or the end of the list is encountered.

Recall that the candidate queue is a sorted linked list of arcs that were very
favorable when collect_can was invoked. However, after one or more pivots, they may not
remain favorable. Thus, as each arc in the queue is traversed, getcan recalculates its
reduced cost. If the reduced cost is still favorable, getcan returns a pointer to that arc and
resets canque to the next arc in the list. If the reduced cost is not favorable, getcan
returns that arc to the appropriate out of basis list. If the candidate queue is empty,
getcan returns a NULL pointer to the calling routine, indicating exhaustion.
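
The queue traversal just described can be sketched as below. The qarc record, the dp argument, and the sign convention for "favorable" are assumptions made for the illustration, and the function is only named getcan to mirror the text. SNET's own version works on its arc structures and may differ in detail.

```c
#include <stddef.h>

/* Hypothetical queue element for the sketch. */
typedef struct qarc {
    int  tail, head;     /* node indices */
    long cost;
    int  at_upper;       /* 1 if the arc came from the upper bound list */
    struct qarc *next;
} qarc;

static qarc *canque, *outlo, *outhi;   /* list heads, as in the text */

/* Pop arcs from the candidate queue until one is still favorable.
   Stale arcs go back to their out of basis list.  Returns NULL when
   the queue is exhausted. */
qarc *getcan(const long *dp)
{
    while (canque != NULL) {
        qarc *a = canque;
        long  r;

        canque = a->next;                          /* advance the queue head */
        r = a->cost + dp[a->tail] - dp[a->head];   /* recompute reduced cost */

        if (a->at_upper ? (r > 0) : (r < 0)) {
            a->next = NULL;                        /* detach and return it */
            return a;
        }
        if (a->at_upper) {                         /* no longer favorable */
            a->next = outhi;
            outhi = a;
        } else {
            a->next = outlo;
            outlo = a;
        }
    }
    return NULL;
}
```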

3. Pivoting the Basis 

Pivoting the basis involves three major steps: selecting a variable to exit the
basis, adjusting the relevant variables, and updating the basis. The arc which limits the
change in flow induced by the incoming arc, as discussed in Chapter 2, will be the exiting
variable. The relevant variables are the arcs on the unique cycle that the incoming arc
forms. As flow is adjusted on them, the feasible solution approaches optimality. Since
the basis arcs form a spanning tree, updating the basis can be equated to rehanging the
basis tree.

Three functions are required to pivot the basis: the principal function, pivot, and
two subordinate functions, mature and calc_ddp.

pivot first traces out the unique cycle formed by the incoming arc, newarc.
As it traces the cycle, pivot must record two items of information on each cycle arc:
orientation and location. An arc’s orientation is measured relative to newarc’s. An arc
can either flow with newarc, in the same direction, or against it, in the opposite direction.
pivot records orientation by using two ALE lists: a with flow (wflow) list and
an against flow (aflow) list. Location is also measured relative to newarc. A cycle arc
must be located either on newarc’s head side or on newarc’s tail side. An arc’s location is
stored in its next field.











pivot begins the cycle trace by receiving newarc, the incoming arc that getcan
selected, from the main program. newarc’s head and tail nodes are then identified.
If both nodes are not at the same depth in the spanning tree, pivot travels up the deeper
side, noting each arc’s location and orientation, until it reaches the node at the same
depth as the end node of the untraversed branch. pivot then travels up both sides of the
cycle until it reaches the first common node. Again, each arc’s data is noted. When the
common (joining) node has been reached, each arc in the cycle has been identified and
classified with respect to orientation and location.
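
The depth-guided climb to the joining node can be sketched as follows. The tnode type and the find_join function are hypothetical; the sketch only counts the arcs on each side, where pivot would instead record each arc's orientation and location.

```c
/* Hypothetical tree node for the sketch. */
typedef struct tnode {
    struct tnode *p;   /* parent in the basis tree (NULL at the root) */
    int d;             /* depth; the root has depth 0 */
} tnode;

/* Climb from the tail and head nodes of the incoming arc to their first
   common ancestor.  *ntail and *nhead receive the number of tree arcs
   traversed on each side of the cycle. */
tnode *find_join(tnode *tail, tnode *head, int *ntail, int *nhead)
{
    *ntail = 0;
    *nhead = 0;

    /* Level the two sides by climbing the deeper one first. */
    while (tail->d > head->d) { tail = tail->p; (*ntail)++; }
    while (head->d > tail->d) { head = head->p; (*nhead)++; }

    /* Climb both sides together until they meet. */
    while (tail != head) {
        tail = tail->p; (*ntail)++;
        head = head->p; (*nhead)++;
    }
    return tail;   /* the common (joining) node */
}
```

Because both sides are at equal depth before the joint climb begins, the loop visits only cycle arcs and never overshoots the joining node.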

If newarc was out of the basis at its lower bound, then those arcs oriented in the
same direction as newarc can only increase their flow, and those opposing newarc can
only decrease their flow. Conversely, if newarc was out of the basis at its upper bound,
those arcs oriented in the same direction as newarc can only decrease their flow, and
those opposing newarc can only increase their flow. pivot makes this association and
renames wflow and aflow to increase and decrease as appropriate. It then traverses both
lists, searching each for the arcs allowing the minimum possible change. After locating
these arcs, pivot is ready to select the exiting arc, oldarc.

oldarc will be the arc that limits the flow change induced by newarc. pivot
compares the flow limits imposed by newarc itself, the minimum increasing arc, and the
minimum decreasing arc. The arc with the smallest limit becomes the exiting arc. If
there is a tie in limits, pivot selects newarc if possible; otherwise it selects the decreasing
arc. The variable delta is set to the limiting quantity of the limiting arc. If delta is
greater than zero, the increase and decrease lists are traversed and their flows adjusted
appropriately, as is newarc’s flow.
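
The comparison and its tie-breaking order can be sketched as below. The enum and function names are hypothetical. The three limits are taken as inputs, since how pivot computes them from flows and capacities is not shown here.

```c
/* Which arc leaves the basis (names are illustrative only). */
enum leaving { LEAVE_NEWARC, LEAVE_DECREASE, LEAVE_INCREASE };

/* Choose the exiting arc from the three flow limits and store the
   limiting quantity in *delta.  The strict comparisons give the stated
   tie-breaking: newarc first, then the decreasing arc. */
enum leaving choose_leaving(long lim_new, long lim_inc, long lim_dec,
                            long *delta)
{
    enum leaving who = LEAVE_NEWARC;
    long best = lim_new;

    if (lim_dec < best) { best = lim_dec; who = LEAVE_DECREASE; }
    if (lim_inc < best) { best = lim_inc; who = LEAVE_INCREASE; }

    *delta = best;
    return who;
}
```

When newarc itself is selected, it simply moves from one bound to the other and no rehanging of the tree is needed.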

If the incoming arc, newarc, is not also the outgoing arc, oldarc, then the basis
tree must be rehung. Rehanging first adds newarc to, and removes oldarc from, the
spanning tree structure. Next, the stem, which consists of the cycle arcs and nodes
between newarc and oldarc inclusive, must be adjusted. The mature function standardizes
the stem by making each stem node the first child of its parent. This permits adjustment
of the stem’s parent, child, sibling, and parent arc relationships in an efficient manner.

Next, the disposition of oldarc must be resolved. Recall that there are two types
of arcs in the symbolic algorithm formulation, real and artificial. A real arc is one that
corresponds to an existing arc in the network. An artificial arc does not correspond to
any arc in the network. It is used to establish an initial feasible solution in the symbolic
network formulation. If oldarc is a real arc, or an artificial arc with positive flow, it will
be added to the appropriate out of basis list. Otherwise, it is discarded.

pivot’s final step is to call calc_ddp, which recalculates the depth and dual price
of each rehung node.


D. EXTERNAL (INPUT/OUTPUT) FILES 

SNET requires only two I/O files. The input file contains network data in standard 
network format. The first line of the input file indicates the number of nodes (N) in the 
problem. All subsequent lines are four-tuples, indicating tail, head, cost, and capacity 
of the arcs. Exogenous flow, either to or from a node, is indicated by exogenous arcs in 
the input file. A node’s supply is indicated by an arc coming from the source node (node 
number N+1) to that node. A node’s demand is indicated by an arc going from that
node to a sink node (node number N+2). The cost of these exogenous arcs is zero and 
their capacity is the amount of the exogenous flow. 
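
A reader for this format might look like the sketch below. It is an illustration under stated assumptions: the arcrec type, the read_network name, the use of long for every field, and the -1 error return are all hypothetical. The only thing taken from the thesis is that the file holds a node count followed by four-tuples, and (from the profile discussion) that SNET reads with fscanf.

```c
#include <stdio.h>

/* One input line after the first: tail, head, cost, capacity. */
typedef struct { long tail, head, cost, cap; } arcrec;

/* Reads the node count and then four-tuples until end of file.
   Returns the number of arcs read, or -1 on a format error or if more
   than maxarcs tuples are present. */
long read_network(FILE *in, long *nnodes, arcrec *out, long maxarcs)
{
    long   n = 0;
    arcrec a;
    int    got;

    if (fscanf(in, "%ld", nnodes) != 1)
        return -1;

    while ((got = fscanf(in, "%ld %ld %ld %ld",
                         &a.tail, &a.head, &a.cost, &a.cap)) == 4) {
        if (n >= maxarcs)
            return -1;
        out[n++] = a;
    }
    /* A clean end of file yields EOF; anything else is a bad tuple. */
    return got == EOF ? n : -1;
}
```

For a three node problem, a supply of 10 at node 1 would appear as the line "4 1 0 10" and a demand of 10 at node 2 as "2 5 0 10", since N+1 = 4 and N+2 = 5.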

The input filename is normally designated from the command line. If the filename
is not given on the command line, SNET will explicitly request one. If the input file does
not exist or the input data is not in the correct format, SNET will so inform the user and
then halt.

The solution is written to the output file. The output file contains the tail, head, 
flow and cost of each active (nonzero flow) arc and the value of the objective function. 
Its filename consists of the input filename with .ans concatenated to it. 
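
Forming the output filename can be sketched as follows. The function name make_outname and its error convention are hypothetical; the sketch uses snprintf so that an overlong input name is reported instead of overrunning the buffer.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes "<inname>.ans" into out.  Returns 0 on success, or -1 if the
   result does not fit in outsize bytes. */
int make_outname(const char *inname, char *out, size_t outsize)
{
    int n = snprintf(out, outsize, "%s.ans", inname);

    return (n >= 0 && (size_t)n < outsize) ? 0 : -1;
}
```

With an input file named net01, this yields net01.ans.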


E. TESTING/TUNING TOOLS 

Many tools are available to assist a programmer in efficient algorithm implementa- 
tion. Among the more valuable tools used in SNET’s development were debuggers, 
profilers, data snapshots, and timing routines. 

1. Debugger 

A debugger is a software package, distinct from the software being developed,
that can precisely control and display the execution of compiled source code. It can
assist a programmer in locating and correcting program errors (commonly called bugs).
Often, more time is spent locating and correcting errors in a program than was
spent writing it. Debuggers, by expediting the correction process, help reduce debugging
time and speed the development process.

Several useful facilities are available in most debuggers. Perhaps the most
valuable debugging function is the ability to observe and change variable values during
program execution. A debugger can display values at any point in time or continuously
during execution. It can also allow the user to change the value of a variable, regardless
of its previous program defined value.

Different methods of execution control are usually available. A debugger can
usually step through code one line or executable statement at a time. Most debuggers
offer the option of stepping into or over subordinate functions at the user’s discretion.
Breakpoints, or execution halt points, are also a common feature. When the user sets a
breakpoint in the source code, the debugger will run the program up to that point, and
then return execution control to the user. A few debuggers can also step through code
in reverse order; that is, they can run the program backwards, allowing decisions and
data manipulations to be undone. This facility allows the user to perform complex
multiway decision tests with ease.

Although competent debuggers offer many additional features, these two,
display and control, were sufficient to locate many obscure errors in the early SNET code.
The author highly recommends the use of a debugger for even moderately complex
programs.

2. Profiler 

A profiler measures a program’s performance and execution time. It, like the 
debugger, is also external to the source code under development. Profiler software tab- 
ulates program execution time, either by function or line; counts how many times a line 
is executed; and tracks how many times a function is called and by whom. It can also 
monitor activities external to the program, such as CPU interrupts and disk accesses. 
By monitoring critical activities and providing detailed reports on them, a profiler high- 
lights inefficient program segments and can assist the programmer in refining his code 
for efficient execution. 

Without profilers, programmers have to resort to ad hoc timing and counting 
functions that must be inserted into the source code and modified as the timing interests 
change. The information they return is neither as complete nor accurate as that pro- 
vided by the profiler. Ad hoc functions can also skew the information gathered by the 
execution resources that they consume. 
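
For comparison, an ad hoc timing function of the kind described might look like the sketch below. The names time_call and busy_work are hypothetical, and busy_work is only a stand-in workload. The measurement relies on the standard C clock function, whose resolution is implementation dependent.

```c
#include <time.h>

/* Stand-in workload; in practice this would be the routine under study. */
static void busy_work(void)
{
    volatile long k;

    for (k = 0; k < 100000L; k++)
        ;
}

/* Returns the processor time, in seconds, consumed by one call to fn. */
double time_call(void (*fn)(void))
{
    clock_t start = clock();

    fn();
    return (double)(clock() - start) / (double)CLOCKS_PER_SEC;
}
```

Every such wrapper has to be inserted and removed by hand, and the calls to clock themselves consume time, which is the skew noted above.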

SNET’s tuning process used a profiler that gathered statistics by function. An 
example profile of an early version of SNET for the net43 problem is given in 
Figure 3. 

In this example, the fscanf function, which reads in the problem, consumes
approximately half of the program’s time. This time cannot be reduced if the decision to
program in standard C is followed. Thus, the programmer must search elsewhere for
improvement. The conflict between collect_can and pivot is of interest. In past primal
simplex implementations, costing out variables was more expensive than pivoting them
into the basis. As the profile indicates, this is not the case here. Thus, the programmer
may wish to examine these two functions in more detail.

Figure 3.

3. Data Snapshots

A debugger enables the programmer to view critical variable values during
program execution. However, sometimes this is still inadequate to correctly diagnose a
program error. In this case, a partial or complete snapshot, or dump, of the data structures
is required. The three functions, dumpnode, dumparc, and dumpale, shown in Figure 4,
can provide a complete or partial snapshot of the data structures at any point in the
program.

A call to these functions with the relevant pointer will direct a snapshot to an
external snapshot or dump file. Each function starts at the element indicated by its
incoming pointer, and then visits the remaining elements in the relevant structure:
dumpnode recursively, and dumparc and dumpale by following the next pointers.
dumpnode can be used to examine the basis tree. The out of basis lists are
detailed by the dumparc function. The dumpale function can be used to examine the
candidate queue and the cycle identified during basis pivoting. These functions were
most useful in diagnosing the rehanging of the basis tree.

4. Timing Routines 

In evaluating alternative implementation strategies, it is easy to be misled if one 
examines strategies on only a few problems. A better approach is to compare the per- 
formance of alternative strategies over a wide variety of problems. 

External timing routines can easily compare different programs and record their 
performance over a wide variety of problems. TESTEXEC, a timing routine used in 
SNET’s development, is shown in Appendix C. The programmer can implement the 
competing strategies in different versions of SNET and then run TESTEXEC to compare 
the programs. TESTEXEC uses the 40 standard NETGEN [Ref. 11] problems to com- 
pare program performance. 

Thus, when comparing alternative strategies, one should test them over a wide 
variety of problems. As TESTEXEC demonstrates, the power of modern computers and 
programming languages allows a hypothesis to be examined over a large sample set. 
This can only increase the strength of one’s conclusion. 


F. TESTING FOR CORRECTNESS 
SNET’s solutions were tested for correctness by comparing optimal objective
function values to those calculated by GNET. GNET is a widely distributed program, used
in over 100 projects for 15 years, with no reported cases of incorrect solutions. No
discrepancies were found.


void dumpnode(node *curr, char indent[80])
{   char nextindent[80] ;

    /* As printed, p, c, s, and pa are dereferenced before any NULL test, */
    /* so the caller must pass a node for which they are all valid.       */
    fprintf(dump, "\n%s[n%i p%i c%i s%i d%i pat%i pah%i pax%li pau%li dp%+2li",
            indent,
            curr->lbl, curr->p->lbl, curr->c->lbl, curr->s->lbl,
            curr->d, curr->pa->t->lbl, curr->pa->h->lbl,
            curr->pa->x, curr->pa->u, curr->dp) ;
    strcpy(nextindent, indent) ;
    strcat(nextindent, " ") ;
    if (curr->c != NULL) dumpnode(curr->c, nextindent) ;   /* children, indented */
    if (curr->s != NULL) dumpnode(curr->s, indent) ;       /* siblings, same level */
    return ;
}

void dumparc(arc *curr)
{   while (curr != NULL)
    {   fprintf(dump, "\n [t%i h%i x%li u%li r%+2li",
                curr->t->lbl, curr->h->lbl,
                curr->x, curr->u, curr->r) ;
        curr = curr->next ;
    }
    return ;
}

void dumpale(ale *curr)
{   while (curr != NULL)
    {   fprintf(dump, "\n [t%i h%i x%li u%li r%+2li",
                curr->a->t->lbl, curr->a->h->lbl,
                curr->a->x, curr->a->u, curr->a->r) ;
        curr = curr->next ;
    }
    return ;
}

Figure 4. Data Snapshot Functions.


V. TUNING PARAMETER STUDY


A primary research effort examined how characteristics or measures of the network
problem’s topology influenced the effects of program tuning parameters. The ultimate
goal of this research was to be able to calculate some simple measures of the problem
and then use these measures to select the tuning parameters that could solve the problem
in the fastest time.

This line of research eventually revealed that SNET is influenced only minimally by
the choice of tuning parameters, let alone the characteristics of the network problem. 
Nonetheless, the author believes that the process used in the tuning parameter study 
could, with other problem measures or program tuning parameters, produce more in- 
teresting results in future experiments. This chapter describes that general research 
process. 


A. PROBLEM CHARACTERISTICS 

In selecting measures of the network problem to examine, one fact must remain 
paramount: if the calculations needed to derive the measures are too intensive, more 
time may be spent calculating them than would be required to solve the problem without 
setting the tuning parameters. Thus the measures must be simple to calculate. Prefer- 
ably, they should be derivable as the problem data is being read into the program. 

Three characteristics were studied: connectivity, exogenaity, and capacitance. 
Connectivity (conn) measures the degree of connection between a network’s nodes. It 
is defined as the number of arcs divided by the number of nodes squared. Its value can 
range from (essentially) zero to one. The connectivity of a network that is sparsely 
connected, for example a spanning tree, would be close to zero. A network that is 
completely connected, with an arc between every pair of nodes, would receive a value 
of one.

Exogenaity (exog) measures the degree of exogenous (external) communication a 
network has with its environment. It is defined as the number of exogenous nodes 
(supply or sink nodes) divided by the total number of nodes. Its value can range from 
zero to one. The exogenaity of a network that has few supply and sink nodes would be 
close to zero. A transportation or assignment network, where every node is exogenous,
would receive a value of one.

Capacitance (cap) is a coarse measure of how much additional flow a network can
tolerate. It is defined as the total flow from all supply nodes divided by the sum of the
arcs’ capacities. Its range is also from zero to one, although in practical networks its
value is likely to be less than one half. A network with a very low capacitance value can
probably handle more flow, while one with a moderate or high level will have more
difficulty accommodating increased flow.
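
The three measures follow directly from counts that can be accumulated while the problem is read. A minimal sketch, in which the netstats type, the measure function, and its parameter names are hypothetical, and both nnodes and sumcap are assumed to be positive:

```c
/* The three problem characteristics studied. */
typedef struct { double conn, exog, cap; } netstats;

/* nnodes: number of nodes          narcs:  number of arcs
   nexog:  supply plus sink nodes   supply: total flow from supply nodes
   sumcap: sum of all arc capacities */
netstats measure(long nnodes, long narcs, long nexog,
                 long supply, long sumcap)
{
    netstats m;

    m.conn = (double)narcs  / ((double)nnodes * (double)nnodes);
    m.exog = (double)nexog  / (double)nnodes;
    m.cap  = (double)supply / (double)sumcap;
    return m;
}
```

For a network of 10 nodes and 50 arcs, with 5 exogenous nodes, a total supply of 100, and a capacity sum of 400, the measures are 0.5, 0.5, and 0.25.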


B. TUNING PARAMETERS 

Tuning parameters are meant to guide the behavior of a program, hopefully leading 
to improved performance. Possible tuning parameters applicable to network programs 
include the number of arcs examined during each collection, the size of the candidate 
queue, the number of candidates examined before incoming arc selection and the value 
of Big M. Although several tuning parameters may be set within SNET, this research 
examined only one, percan. 

Recall that percan, the percentage of out of basis candidates, is used to set
maxnbrcan, the maximum number of favorable candidates encountered and considered
for inclusion in the candidate queue during each collection, as follows:


maxnbrcan = percan x (number of real arcs - number of nodes)


maxnbrcan limits the number of out of basis arcs examined during each collection
by collect_can.


C. EXAMPLE GENERATION 

An external driver program, DSSTEST (Appendix D), was used to generate the ex- 
amples for the study. DSSTEST uses three nested loops to vary the values of 
connectivity, exogenaity, and capacitance. Each variable assumed five different values, 
for a combination of 125 three-tuples. Each tuple was fed, via a translation routine, to 
NETGEN, which generates a random network problem whose characteristics are equal
to the tuple values. A fourth nested loop increments percan and then calls SNET to
solve the network problem. During the experiment percan assumed 19 different values.
The time required for each call to SNET is recorded by DSSTEST, which then writes the
(conn, exog, cap, percan, solution time) five-tuple to an external data file. DSSTEST
generated and recorded 2375 five-tuple examples. 

Before analysis, the example set was reduced by selecting the fastest example from
each of the 125 problems and then deleting the solution time from each example. Thus,
the example set used for analysis (Appendix E) consisted of 125 four-tuples, each
containing the three-tuple that represents the characteristics of the problem and the tuning
parameter that produced the fastest solution for that problem.

It should be noted that the differences in solution times for each NETGEN problem
were usually minor once percan exceeded a low (roughly 30 percent) threshold.


D. ANALYSIS 

The examples were then analyzed using two quite different approaches: inductive 
learning, and multivariate linear regression analysis. 

1. Inductive Learning 

The inductive reasoning process starts with specific examples and attempts to
develop general rules. Hopefully, these generalizing rules can replicate the knowledge
contained in the examples in a more compact form.

Quinlan’s ID3 induction algorithm [Ref. 12: pp. 167-173] was used to analyze
the reduced example set. The ID3 algorithm builds rules in the form of a compact
decision tree. At each node, ID3 examines the examples available to it. Using them, it
calculates how well each of the unused example characteristics predicts the parameter
value. The characteristic that best performs this function is marked as used and then
serves as the criterion for dividing the examples among the node’s children. Although
each characteristic value can generate a child, values are usually aggregated to generate
the fewest children possible. This division process starts at the tree’s root node and
continues, recursively, until all the examples assigned to a node have the same parameter
value.

The decision tree generated by ID3 (Appendix F) is not significantly more
compact than the original example set. It has 101 terminal nodes. The example set
contained 125 examples. Thus ID3’s induction could not provide general rules given this
example set.

2. Regression Analysis 

Multivariate linear regression provided more fruitful results. Stepwise regression
was applied to the same reduced example set. Analysis of the output (Appendix G)
revealed that exogenaity was the only characteristic that had any statistically significant
impact upon percan. Though unquestionably important, exogenaity could only explain
about 50 percent of the variance in the value of percan.

Thus neither analysis technique could provide an adequate method of setting the
percan parameter given the problem’s characteristics.


E. DYNAMIC TUNING

Although this research concentrated on static (unchanging) parameters and
characteristics, a similar methodology could be employed in the analysis of dynamic parameters
and characteristics. A study of dynamically varying tuning parameters during program
execution, based upon dynamic problem characteristics, may provide clues for solving
problems faster.

Each of the tuning parameters given above can be altered dynamically. However,
dynamic problem characteristics are more difficult to derive. Possible candidates for
future study include the slope of the objective function or the percentage of flow on
artificial arcs. This is an area that deserves further study.


VI. CONCLUSIONS AND RECOMMENDATIONS


A. CONCLUSIONS 

SNET is approximately twice as fast as a primal simplex network solver written in
FORTRAN. SNET’s algorithm is not new in the world of network solvers; only its
method of storing and retrieving data is. As discussed in Chapter 3, its pointer linked data
structures allow information to be retrieved and manipulated with minimal
computational effort. On the 40 problem NETGEN test set, assigning equal weight to each
problem, SNET is 127 percent faster than GNET, a primal simplex network solver
written in FORTRAN. The test environment used and specific test times are provided
in Appendices A and B.

SNET is relatively insensitive to the tuning parameter examined. As noted in
Chapter 5, SNET’s tuning parameter, percan, was tested over 19 values for 125 different
random problems. For most problems, once percan exceeded the 30 percent threshold,
its effect on solution time was negligible.

When new programming features become available for incorporation into a solver,
fundamental assumptions may no longer be valid and, therefore, should be reexamined.
Before the use of true pointer data structures, optimizers spoke of pivots being cheap and
price-outs being expensive. As the profiler reveals, this is not true when pointer based
structures maintain the data. Many examples of this maxim were encountered during
SNET’s development.


B. RECOMMENDATIONS 

The author has two recommendations for future research: one concerning SNET
and the other dealing with general network problems.

First, SNET, though fast, is not a mature solver. It could benefit from
improvements in many areas. Most serious, though, is the number of pivots that SNET requires
to solve a problem. A better method of selecting incoming arcs is needed to reduce the
number of pivots. Although SNET is faster than GNET, it also requires more pivots.
Its candidate selection method is simple and fast, but a better pivot sequence, though
more expensive to acquire, could significantly reduce total solution time.


Three general suggestions are offered for improving the pivot sequence. Each of
these areas could serve as a focus for future research:

• Maintain the out of basis arcs in parallel linked lists, with list membership
determined by a common head or tail node. This data structure would improve the
efficiency of search routines and allow arcs that enter or leave a specific node to be
quickly located.

• Research the effects of other sorting and searching methods during candidate
collection (bringing arcs into the candidate queue) and candidate selection (choosing
an arc from the candidate queue to enter the basis).

• Examine tuning parameter selection as a function of problem characteristics, both
static and dynamic.

Second, since true pointer based structures improved solver efficiency in solving
min-cost network problems, perhaps they can improve the efficiency of shortest path,
min-cost spanning tree, traversal, and other network solvers.
















APPENDIX A. TEST ENVIRONMENT 


SNET and GNET were tested under the same environment. All tests were run on 
a NeXT Computer, model number N9001, with version 1.0 of the NeXTSTEP operating 
system and a Motorola 68030 25MHz CPU.

The NETGEN problems were stored on an optical disk and then copied, one at a
time, to a 330 Megabyte hard disk. Each solver read the problem from the hard drive.
The NeXT’s 16 Megabytes of main memory was more than sufficient for each of the
problems and no paging was required.

The combined problem read in and solution time of each solver was recorded for 
comparison purposes. Time used was measured by the C clock function. 

GNET is written in FORTRAN 77 and was compiled using the Absoft FORTRAN 
compiler, NeXT version 2.0, with the optimizing flag turned on. GNET’s tuning pa- 
rameters were set as recommended by its developers. 

SNET is a C program. It was compiled with the NeXT version of the GNU C
compiler developed by the Free Software Foundation. SNET was also compiled with
the optimizing flag turned on. Its tuning parameter, percan, was not set via the
command line argument. Therefore, by default, maxnbrcan was set to MAXCAN (20,000).








APPENDIX B. TIME TRIALS: SNET VS. GNET 


netgen     snet      gnet   fraction 
   1       2.53      4.92     0.51 
   2       2.80      5.34     0.52 
   3       3.22      6.75     0.48 
   4       3.50      7.30     0.48 
   5       3.86      9.22     0.42 
   6       4.66     10.58     0.44 
   7       6.25     14.67     0.43 
   8       7.05     16.66     0.42 
   9       7.56     18.48     0.41 
  10       8.09     19.80     0.41 
  11       3.23      6.25     0.52 
  12       3.56      8.52     0.42 
  13       4.61     10.39     0.44 
  14       5.45     12.78     0.43 
  15       5.56     15.20     0.37 
  16       2.25      4.72     0.48 
  17       3.03      7.88     0.38 
  18       2.36      4.86     0.49 
  19       3.11      7.59     0.41 
  20       2.39      5.09     0.47 
  21       3.38      8.09     0.39 
  22       2.34      5.00     0.47 
  23       3.22      8.58     0.38 
  24       2.42      4.95     0.49 
  25       3.64      8.27     0.44 
  26       1.95      4.609    0.42 
  27       3.22      8.09     0.40 
  28       4.30     10.7      0.40 
  29       5.33     11.88     0.45 
  30       5.80     14.52     0.40 
  31       6.25     15.77     0.40 
  32       6.59     16.70     0.39 
  33       6.69     16.61     0.40 
  34       7.52     19.05     0.39 
  35       8.61     20.78     0.41 
  36      54.34    123.??     0.44 
  37      55.42    105.75     0.52 
  38      56.20    125.67     0.45 
  39      41.55     81.41     0.51 
  40      36.95     82.39     0.45 
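The fraction column is the SNET time divided by the GNET time for the same problem, and the TESTEXEC driver averages these fractions over all problems. A minimal sketch of that arithmetic follows; the function names `time_fraction` and `mean_fraction` are hypothetical and do not appear in the thesis.

```c
#include <stddef.h>

/* Ratio of SNET time to GNET time for one problem (the "fraction" column).
   Assumes gnet_sec is positive. */
double time_fraction(double snet_sec, double gnet_sec)
{
    return snet_sec / gnet_sec;
}

/* Mean of the per-problem fractions over n problems.
   Returns 0.0 when n is 0 so the caller never divides by zero. */
double mean_fraction(const double snet[], const double gnet[], size_t n)
{
    double sum = 0.0;
    size_t i;

    if (n == 0)
        return 0.0;
    for (i = 0; i < n; i++)
        sum += time_fraction(snet[i], gnet[i]);
    return sum / (double)n;
}
```

For the first table row, `time_fraction(2.53, 4.92)` is about 0.514, which rounds to the tabulated 0.51.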
APPENDIX C. PROGRAM LISTING: TESTEXEC 


/* Time network solvers on standard NETGEN problems */ 


/* Header files */ 
#include <time.h> 
#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 


/* Print an error message to the screen and stop the program */ 
void halt(char message[80]) 
{  printf("\nHalt invoked, error in %s process.\n", message) ; 
   exit(1) ; 
} 


/* reverse: reverse string s in place */ 
void reverse(char s[]) 
{  int c, i, j ; 

   for (i = 0, j = strlen(s) - 1; i < j; i++, j--) 
   {  c = s[i] ; 
      s[i] = s[j] ; 
      s[j] = c ; 
   } 
} 


/* itoa: convert n to characters in s */ 
char *itoa(int n) 
{  int i = 0, sign ; 
   static char s[25] ;   /* static so the returned pointer stays valid */ 

   if ((sign = n) < 0) n = -n ; 
   do {  s[i++] = n % 10 + '0' ; 
   } while ((n /= 10) > 0) ; 
   if (sign < 0) s[i++] = '-' ; 
   s[i] = '\0' ; 
   reverse(s) ; 
   return(s) ; 
} 


void main() 
{  int   i, first, last ; 
   long  start, 
         end ; 
   float deltime1, deltime2, fraction, sum = 0 ; 
   char  infilename[80], 
         outfilename[80], 
         code[25], 
         command[80] ; 
   FILE  *infile, 
         *timefile ; 

   /* Open timing file */ 
   if ((timefile = fopen("raw.time", "a")) == NULL) 
      halt("opening timing file") ; 

   /* Print timing file header */ 
   fprintf(timefile, "netgen snet gnet fraction\n") ; 


   /* Main loop */ 
   /* NOTE: be sure loop values are correct before final tests */ 
   first = 1 ; 
   last  = 40 ; 

   for (i = first; i <= last; i++) 
   {  /* Get filename for optical disk */ 
      strcpy(code, itoa(i)) ; 
      strcpy(infilename, "/NetGenDisk/problems/net") ; 
      if (i < 10) strcat(infilename, "0") ; 
      strcat(infilename, code) ; 

      /* Set local filename */ 
      strcpy(outfilename, "net") ; 
      if (i < 10) strcat(outfilename, "0") ; 
      strcat(outfilename, code) ; 

      /* Inform console */ 
      printf("\n********** PROBLEM NET%i **********\n", i) ; 

      /* Copy file from optical disk to hard disk */ 
      strcpy(command, "cp ") ; 
      strcat(command, infilename) ; 
      strcat(command, " ") ; 
      strcat(command, outfilename) ; 
      system(command) ; 


      /* Time pointer version of snet */ 
      strcpy(command, "time pointer ") ; 
      strcat(command, outfilename) ; 
      start = clock() ; 
      system(command) ; 
      end = clock() ; 
      deltime1 = ((float)(end - start) / CLK_TCK) ; 
      fprintf(timefile, " %8.2f ", deltime1) ; 

      /* Time GNET */ 
      printf("\n") ; 
      rename(outfilename, "net") ; 
      strcpy(command, "time gnet ") ; 
      start = clock() ; 
      system(command) ; 
      end = clock() ; 
      deltime2 = ((float)(end - start) / CLK_TCK) ; 
      fprintf(timefile, " %8.2f ", deltime2) ; 

      /* Calculate snet as a fraction of GNET */ 
      fraction = deltime1 / deltime2 ; 
      fprintf(timefile, " %8.2f\n", fraction) ; 
      sum += fraction ; 
      /* Delete file from hard disk */ 
      remove("net") ; 
   } 

   sum = sum / ((last - first) + 1) ; 
   fprintf(timefile, "\n mean difference in speed %8.2f\n", sum) ; 
   fclose(timefile) ; 
} 
















APPENDIX D. PROGRAM LISTING: DSSTEST 


/* Time random network problems and program parameters */ 


/* Header files */ 
#include <time.h> 
#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 


/* Print an error message to the screen and stop the program */ 
void halt(char message[80]) 
{  printf("\nHalt invoked, error in %s process.\n", message) ; 
   exit(1) ; 
} 


/* reverse: reverse string s in place */ 
void reverse(char s[]) 
{  int c, i, j ; 

   for (i = 0, j = strlen(s) - 1; i < j; i++, j--) 
   {  c = s[i] ; 
      s[i] = s[j] ; 
      s[j] = c ; 
   } 
} 


/* itoa: convert n to characters in s */ 
char *itoa(int n) 
{  int i = 0, sign ; 
   static char s[25] ;   /* static so the returned pointer stays valid */ 

   if ((sign = n) < 0) n = -n ; 
   do {  s[i++] = n % 10 + '0' ; 
   } while ((n /= 10) > 0) ; 
   if (sign < 0) s[i++] = '-' ; 
   s[i] = '\0' ; 
   reverse(s) ; 
   return(s) ; 
} 


void main() 
{  int   nbr_nodes = 200, 
         min_arc_cost,            /* initial value illegible in the source */ 
         max_arc_cost = 100, 
         nbr_arcs, 
         nbr_exog_nodes, 
         nbr_sup_nodes, 
         nbr_dem_nodes, 
         max_up_bnd, 
         min_up_bnd, 
         parameter, 
         j ; 
   long  start, 
         end ; 
   float tot_supply = 100000.0, 
         frac_arcs_cap = 0.75, 
         epsilon = 0.0003, 
         sum_arc_cap, 
         deltim, 
         con, 
         exog, 
         cap ; 
   char  infilename[25], 
         code[25], 
         dummy[80], 
         command[80], 
         scale[500], 
         rand_nbr[] = "39962782", 
         parstr[25] ; 
   FILE  *infile, 
         *timefile ; 

   /* Open timing file */ 
   if ((timefile = fopen("cantimes", "w")) == NULL) 
      halt("opening timing file") ; 


   /* Main loop */ 
   /* NOTE: be sure loop values are correct before final test */ 
   for (con = 0.05; con <= 0.85 + epsilon; con += 0.20) 
   {  for (exog = 0.10; exog <= 0.90 + epsilon; exog += 0.20) 
      {  for (cap = 0.10; cap <= 0.50 + epsilon; cap += 0.10) 
         {  nbr_arcs       = nbr_nodes * nbr_nodes * con ; 
            nbr_exog_nodes = nbr_nodes * exog ; 
            nbr_sup_nodes  = nbr_exog_nodes * 0.20 ; 
            nbr_dem_nodes  = nbr_exog_nodes * 0.80 ; 
            sum_arc_cap    = tot_supply / cap ; 
            max_up_bnd     = sum_arc_cap / nbr_arcs ; 
            min_up_bnd     = 0.25 * max_up_bnd ; 

            strcpy(code, itoa(con * 100)) ; 
            strcat(code, itoa(exog * 100)) ; 
            strcat(code, itoa(cap * 100)) ; 
            strcpy(infilename, "ni") ; 
            strcat(infilename, code) ; 


            if ((infile = fopen(infilename, "w")) == NULL) 
               halt("opening NETGEN 'in' file") ; 

            fprintf(infile, "%s\n", rand_nbr) ; 
            fprintf(infile, 
               "%5i%5i%5i%5i%5i%5i%10li%5i%5i%5.1f%5.1f%10i%10i\n", 
               nbr_nodes, 
               nbr_sup_nodes, 
               nbr_dem_nodes, 
               nbr_arcs, 
               min_arc_cost, 
               max_arc_cost, 
               (long)tot_supply, 
               0, 
               0, 
               5.0, 
               100.0 * frac_arcs_cap, 
               min_up_bnd, 
               max_up_bnd) ; 
            fclose(infile) ; 
            rename(infilename, "netin") ; 


            printf("generating problem for code %s\n", code) ; 
            system("netgen") ; 

            printf("solving problem with captran\n") ; 
            fprintf(timefile, "code: %s\n", code) ; 

            /* NOTE: correct loop values before final test */ 
            for (parameter = 5; parameter <= 95; parameter += 5) 
            {  strcpy(parstr, itoa(parameter)) ; 
               strcpy(command, "captran net ") ; 
               strcat(command, parstr) ; 
               strcat(command, " >outfile") ; 

               start = clock() ; 
               system(command) ; 
               end = clock() ; 
               deltim = ((float)(end - start) / CLK_TCK) ; 

               /* Build a bar of asterisks, one per whole second */ 
               scale[0] = '\0' ; 
               for (j = 1; j <= (int)(deltim); j++) 
                  strcat(scale, "*") ; 

               fprintf(timefile, "%4i %4i %4i %4i %6.2f %s\n", 
                       (int)(con*100), (int)(exog*100), (int)(cap*100), 
                       parameter, deltim, scale) ; 


            } 
            fprintf(timefile, "\n\n") ; 
            printf("\n") ; 

            remove("netin") ; 
            remove("net") ; 
            remove("netrep") ; 
         } 
      } 
   } 

   fclose(timefile) ; 
} 
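The way DSSTEST turns the three problem characteristics into NETGEN inputs can be traced with a small worked sketch. This is an illustration under stated assumptions rather than the thesis code: the struct `netgen_params` and the function `derive_params` are hypothetical, and the characteristics are passed as integer percentages so the arithmetic is exact, whereas the listing uses float fractions with implicit truncation.

```c
/* Hypothetical container for the quantities DSSTEST derives per problem.
   Field names follow the variables in the listing. */
struct netgen_params {
    long nbr_arcs;
    long nbr_exog_nodes;
    long nbr_sup_nodes;
    long nbr_dem_nodes;
    long max_up_bnd;
    long min_up_bnd;
};

/* Derive NETGEN inputs from connectivity, exogeneity and capacitance,
   each given as a percentage (assumption: integer percent, not a fraction).
   Assumes cap_pct > 0 and that the derived arc count is positive. */
struct netgen_params derive_params(long nbr_nodes, long tot_supply,
                                   int con_pct, int exog_pct, int cap_pct)
{
    struct netgen_params p;
    long sum_arc_cap;

    p.nbr_arcs       = nbr_nodes * nbr_nodes * con_pct / 100;
    p.nbr_exog_nodes = nbr_nodes * exog_pct / 100;
    p.nbr_sup_nodes  = p.nbr_exog_nodes * 20 / 100;   /* 20% supply nodes */
    p.nbr_dem_nodes  = p.nbr_exog_nodes * 80 / 100;   /* 80% demand nodes */
    sum_arc_cap      = tot_supply * 100 / cap_pct;
    p.max_up_bnd     = sum_arc_cap / p.nbr_arcs;
    p.min_up_bnd     = p.max_up_bnd / 4;              /* 0.25 * max_up_bnd */
    return p;
}
```

With the listing's 200 nodes and total supply of 100000, the first loop iteration (con 5, exog 10, cap 10) gives 2000 arcs, 20 exogenous nodes split into 4 supply and 16 demand nodes, and upper bounds between 125 and 500.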
APPENDIX E. EXAMPLE SET (REDUCED) 


This reduced example set consists of 125 four-tuples. Each four-tuple holds the 
Connectivity, Exogeneity, and Capacitance characteristics of one random network 
problem together with the best (fastest) tuning parameter choice for that problem. 
The example set is used as sample data for Induction and Regression Analysis. 


Problem Characteristics | Best Parameter 
APPENDIX F. INDUCTIVE DECISION TREE 


Decision tree for reduced example set 


exog?? 
  <20.00: con?? 
    <15.00: cap?? 
      <45.00: cap?? 
        <25.00: cap?? 
          <15.00: ---------- 95 
          >15.00: ---------- 55 
        >25.00: cap?? 
          <35.00: ---------- 30 
          >35.00: ---------- 25 
      >45.00: ---------- 35 
    >15.00: cap?? 
      <45.00: cap?? 
        <25.00: con?? 
          <75.00: cap?? 
            <15.00: con?? 
              <35.00: ---------- 20 
              >35.00: con?? 
                <55.00: ---------- 5 
                >55.00: ---------- 25 
            >15.00: con?? 
              <35.00: ---------- 30 
              >35.00: ---------- 15 
          >75.00: ---------- 10 
        >25.00: con?? 
          <55.00: ---------- 10 
          >55.00: con?? 
            <75.00: cap?? 
              <35.00: ---------- 5 
              >35.00: ---------- 20 
            >75.00: ---------- 10 
      >45.00: con?? 
        <35.00: ---------- 20 
        >35.00: con?? 
          <55.00: ---------- 15 
          >55.00: con?? 
            <75.00: ---------- 20 
            >75.00: ---------- 15 
  >20.00: exog?? 
    <60.00: exog?? 
      <40.00: cap?? 
        <35.00: con?? 
          <15.00: cap?? 
            <15.00: ---------- 30 
            >15.00: cap?? 
              <25.00: ---------- 45 
              >25.00: ---------- 15 
          >15.00: con?? 
            <55.00: con?? 
              <35.00: cap?? 
                <25.00: ---------- 35 
                >25.00: ---------- 25 
              >35.00: cap?? 
                <15.00: ---------- 25 
                >15.00: cap?? 
                  <25.00: ---------- 20 
                  >25.00: ---------- 25 
            >55.00: con?? 
              <75.00: cap?? 
                <15.00: ---------- 10 
                >15.00: cap?? 
                  <25.00: ---------- 30 
                  >25.00: ---------- 25 
              >75.00: cap?? 
                <15.00: ---------- 25 
                >15.00: cap?? 
                  <25.00: ---------- 15 
                  >25.00: ---------- 5 
        >35.00: con?? 
          <35.00: con?? 
            <15.00: ---------- 50 
            >15.00: cap?? 
              <45.00: ---------- 15 
              >45.00: ---------- 50 
          >35.00: con?? 
            <55.00: cap?? 
              <45.00: ---------- 85 
              >45.00: ---------- 30 
            >55.00: con?? 
              <75.00: cap?? 
                <45.00: ---------- 15 
                >45.00: ---------- 5 
              >75.00: ---------- 5 
      >40.00: con?? 
        <35.00: cap?? 
          <45.00: con?? 
            <15.00: cap?? 
              <15.00: ---------- 80 
              >15.00: cap?? 
                <25.00: ---------- 30 
                >25.00: cap?? 
                  <35.00: ---------- 35 
                  >35.00: ---------- 30 
            >15.00: cap?? 
              <15.00: ---------- 35 
              >15.00: cap?? 
                <35.00: ---------- 45 
                >35.00: ---------- 35 
          >45.00: con?? 
            <15.00: ---------- 95 
            >15.00: ---------- 80 
        >35.00: con?? 
          <55.00: cap?? 
            <45.00: ---------- [illegible] 
            >45.00: ---------- 60 
          >55.00: cap?? 
            <15.00: con?? 
              <75.00: ---------- 45 
              >75.00: ---------- 75 
            >15.00: cap?? 
              <35.00: cap?? 
                <25.00: ---------- 30 
                >25.00: ---------- [illegible] 
              >35.00: con?? 
                <75.00: cap?? 
                  <45.00: ---------- 25 
                  >45.00: ---------- 30 
                >75.00: cap?? 
                  <45.00: ---------- 30 
                  >45.00: ---------- 35 
    >60.00: con?? 
      <35.00: con?? 
        <15.00: cap?? 
          <35.00: exog?? 
            <80.00: cap?? 
              <15.00: ---------- 35 
              >15.00: cap?? 
                <25.00: ---------- 50 
                >25.00: ---------- 35 
            >80.00: ---------- 55 
          >35.00: cap?? 
            <45.00: ---------- 75 
            >45.00: ---------- 55 
        >15.00: exog?? 
          <80.00: cap?? 
            <25.00: ---------- 45 
            >25.00: cap?? 
              <35.00: ---------- 55 
              >35.00: cap?? 
                <45.00: ---------- 85 
                >45.00: ---------- 45 
          >80.00: cap?? 
            <15.00: ---------- 40 
            >15.00: cap?? 
              <45.00: cap?? 
                <25.00: ---------- 35 
                >25.00: cap?? 
                  <35.00: ---------- 30 
                  >35.00: ---------- 35 
              >45.00: ---------- 50 
      >35.00: exog?? 
        <80.00: con?? 
          <55.00: cap?? 
            <25.00: cap?? 
              <15.00: ---------- 80 
              >15.00: ---------- 30 
            >25.00: cap?? 
              <35.00: ---------- 60 
              >35.00: cap?? 
                <45.00: ---------- 55 
                >45.00: ---------- 65 
          >55.00: cap?? 
            <25.00: cap?? 
              <15.00: ---------- 60 
              >15.00: con?? 
                <75.00: ---------- 75 
                >75.00: ---------- 55 
            >25.00: con?? 
              <75.00: cap?? 
                <45.00: ---------- 60 
                >45.00: ---------- 50 
              >75.00: cap?? 
                <45.00: ---------- 50 
                >45.00: ---------- 60 
        >80.00: cap?? 
          <25.00: cap?? 
            <15.00: con?? 
              <55.00: ---------- 55 
              >55.00: con?? 
                <75.00: ---------- 80 
                >75.00: ---------- 60 
            >15.00: con?? 
              <55.00: ---------- 60 
              >55.00: con?? 
                <75.00: ---------- 35 
                >75.00: ---------- 60 
          >25.00: con?? 
            <55.00: cap?? 
              <35.00: ---------- 55 
              >35.00: ---------- 45 
            >55.00: con?? 
              <75.00: cap?? 
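Each interior line of the tree names the characteristic tested next, and each leaf gives the recommended tuning parameter. As a minimal sketch of how the tree is read, the function below encodes only the first branch (exog below 20 and con below 15), with thresholds and leaf values taken from the opening lines of the listing. The function name `best_percan_first_branch` is hypothetical, the return value -1 for every other branch is an assumption of the sketch, and the treatment of values exactly equal to a threshold is also an assumption, since the listing shows only `<` and `>`.

```c
/* Walk the first branch of the decision tree (exog < 20, con < 15).
   Arguments are percentages. Returns the recommended tuning parameter,
   or -1 when the inputs fall in a branch this sketch does not encode. */
int best_percan_first_branch(double exog, double con, double cap)
{
    if (!(exog < 20.0 && con < 15.0))
        return -1;                       /* some other branch of the tree */

    if (cap < 45.0) {
        if (cap < 25.0)
            return (cap < 15.0) ? 95 : 55;
        return (cap < 35.0) ? 30 : 25;
    }
    return 35;                           /* cap above 45 */
}
```

For example, a problem with exog 10, con 5 and cap 10 follows the `<20.00`, `<15.00`, `<45.00`, `<25.00`, `<15.00` path and reaches the leaf 95.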











APPENDIX G. REGRESSION ANALYSIS 


LINEAR REGRESSION OF percan ON 3 PREDICTORS, WITH N = 125 

The regression equation is 
percan = 13.5 - 0.0500 con + 0.542 exog + 0.058 cap 

Predictor       Coef      Stdev    t-ratio        p 
Constant      13.530      4.736       2.86    0.005 
con         -0.05000    0.05013      -1.00    0.321 
exog         0.54200    0.05013      10.81    0.000 
cap           0.0580     0.1003       0.58    0.564 

s = 15.85     R-sq = 49.4%     R-sq(adj) = 48.2% 

Analysis of Variance 
SOURCE        DF         SS         MS        F        p 
Regression     3    29710.5     9903.5    39.40    0.000 
Error        121    30412.7      251.3 
Total        124    60123.2 

SOURCE        DF     SEQ SS 
con            1      250.0 
exog           1    29376.4 
cap            1       84.1 

Unusual Observations 
Obs.    con   percan     Fit  Stdev.Fit  Residual  St.Resid 
  2     5.0    55.00   19.86       3.33     35.14     2.27R 
 11     5.0    80.00   40.96       3.17     39.04     2.51R 
 40    25.0    80.00   42.28       2.65     37.72     2.41R 
 44    25.0    85.00   52.54       2.24     32.46     2.07R 
 48    25.0    30.00   62.80       2.65    -32.80    -2.10R 
 59    45.0    85.00   29.86       2.01     55.14     3.51R 
111    85.0    75.00   36.96       3.17     38.04     2.45R 
124    85.0    95.00   60.38       3.33     34.62     2.23R 


STEPWISE REGRESSION OF percan ON 3 PREDICTORS, WITH N = 125 

STEP            1 
CONSTANT    13.02 
exog        0.542 
T-RATIO     10.84 
S            15.8 
R-SQ        48.86 


LINEAR REGRESSION OF percan ON 1 PREDICTOR exog 

The regression equation is 
percan = 13.0 + 0.542 exog 

Predictor       Coef      Stdev    t-ratio        p 
Constant      13.020      2.872       4.53    0.000 
exog         0.54200    0.05000      10.84    0.000 

s = 15.81     R-sq = 48.9%     R-sq(adj) = 48.4% 

Analysis of Variance 
SOURCE        DF       SS       MS         F        p 
Regression     1    29376    29376    117.52    0.000 
Error        123    30747      250 
Total        124    60123 

Unusual Observations 
Obs.   exog   percan     Fit  Stdev.Fit  Residual  St.Resid 
  2    10.0    55.00   18.44       2.45     36.56     2.34R 
 11    50.0    80.00   40.12       1.41     39.88     2.53R 
 40    50.0    80.00   40.12       1.41     39.88     2.53R 
 44    70.0    85.00   50.96       1.73     34.04     2.17R 
 48    90.0    30.00   61.80       2.45    -31.80    -2.04R 
 59    30.0    85.00   29.28       1.73     55.72     3.55R 
111    50.0    75.00   40.12       1.41     34.88     2.21R 
124    90.0    95.00   61.80       2.45     33.20     2.13R 

[Character plot of percan (vertical axis, 0 to 90) against a predictor (horizontal axis ticks at 15, 30, 45); plot body not recoverable.] 
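The single-predictor fit reported in this appendix, percan = 13.02 + 0.542 exog, can be checked against the Fit and Residual columns of its Unusual Observations table. The sketch below is illustrative only; the function names `predict_percan` and `percan_residual` are hypothetical, and the coefficients are the ones printed in the Predictor table.

```c
/* Fitted tuning parameter from the one-predictor regression,
   percan = 13.02 + 0.542 * exog, with exog given as a percentage. */
double predict_percan(double exog_pct)
{
    return 13.02 + 0.542 * exog_pct;
}

/* Residual for one observation: observed best parameter minus the fit. */
double percan_residual(double observed_percan, double exog_pct)
{
    return observed_percan - predict_percan(exog_pct);
}
```

For observation 2 (exog 10.0, percan 55.00) this gives a fit of 18.44 and a residual of 36.56, matching the tabulated values.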